Protein sliding and hopping kinetics on DNA 
Using Monte Carlo simulations, we deconvolved the sliding and hopping kinetics of GFP-LacI 
proteins on elongated DNA from their experimentally observed seconds-long diffusion trajectories. 
Our simulations suggest the following results: (1) in each diffusion trajectory, a protein makes on 
average hundreds of alternating slides and hops with a mean sliding time of several tens of ms; (2) 
sliding dominates the root mean square displacement of fast diffusion trajectories, whereas hopping 
dominates slow ones; (3) flow and variations in salt concentration have limited effects on hopping 
kinetics, while in vivo DNA configuration is not expected to influence sliding kinetics; furthermore, 
(4) the rate of occurrence for hops longer than 200 nm agrees with experimental data for EcoRV 
proteins. 

PACS numbers: 87.15.A-, 87.15.hg, 87.10.Rt, 05.40.Fb 



I. INTRODUCTION 

Timely target association of DNA-binding (DB) proteins is important for
prompt cellular response to external stimuli using mechanisms such as gene
regulation, DNA replication, and DNA repair. The target association rates of
DB proteins frequently deviate from the diffusion limit due to their
interactions with nonspecific DNA via the process of facilitated diffusion
PUS]. Facilitated diffusion mainly consists of two motions: sliding, where a
protein diffuses along nonspecific DNA without losing contact, and hopping,
where the protein jumps off DNA and undergoes 3D diffusion before
reassociating to the same (Fig. 1) or a different segment of DNA (referred to
as intersegmental transfer). In this article, we regard events with long
hopping distances, usually called jumping, as a form of hopping. A DB protein
may slide and hop many times on nonspecific DNA before reaching the target. In
order to quantify the effect of facilitated diffusion on DB proteins' target
binding rate, how long a protein spends sliding on DNA (mean sliding time
$\langle t_1\rangle$) and how fast it moves along DNA (sliding diffusion
coefficient $D_1$) are two critical parameters for all calculations of in
vitro and in vivo DNA geometries [2j HHE].

Single-molecule (SM) fluorescence imaging studies of DB proteins' Brownian
diffusion along elongated DNA have obtained effective diffusion coefficients
$D$ for the whole seconds-long diffusions (in this article we define each
observed diffusion event between protein association and permanent
dissociation to be a diffusion trajectory, and $t$ is the total time of the
diffusion) [3l 03-121]. In the past, numerous studies have substituted $t$ and
$D$ values in place of $\langle t_1\rangle$ and $D_1$ in target binding rate
and protein-nonspecific-DNA binding energy calculations, since
$\langle t_1\rangle$ and $D_1$ were not experimentally accessible
[1I5HE1IMII1H3I12. Because the extent of hopping involvement is unknown, using
$t$ and $D$ values for $\langle t_1\rangle$ and $D_1$ is questionable. Recent
evidence suggests that these diffusion trajectories include both sliding and
hopping: (1) the sliding time of DB proteins has been estimated to be
milliseconds [6, 12, 22, 23]; (2) the sliding displacement has been estimated
to be less than 50 bp [23], shorter than the displacements of whole diffusion
trajectories of the reported DB proteins (> 100 nm); (3) hops longer than 200
nm have been observed [15]. In order to obtain $\langle t_1\rangle$ and $D_1$
from experimental data, it is necessary to deconvolve sliding and hopping in
individual diffusion trajectories.



II. SIMULATIONS 

Here we deconvolve sliding and hopping in a diffusion trajectory and obtain
$\langle t_1\rangle$ and $D_1$ using (i) Monte Carlo simulations, (ii)
experimental $D$ and $t$ values, and (iii) the following two relations
(derived in [25]):

$$ t = N\langle t_1\rangle + N\langle t_3\rangle , \tag{1} $$
$$ 2Dt = 2D_1 N\langle t_1\rangle + 2D_3 N\langle t_3\rangle , \tag{2} $$

where $N$ is the mean number of sliding and hopping alternations in a
diffusion trajectory, $D_3$ is the 3D diffusion coefficient of the protein,
and $\langle t_3\rangle$ is the mean hopping time. From hopping simulations we
first determine $N$ and $\langle t_3\rangle$; then, combining these with
experimental $D$ and $t$ values, we obtain $\langle t_1\rangle$ and $D_1$
using Eqs. (1) and (2).

FIG. 1. (Color online) Schematic of a diffusion trajectory showing a protein
initially binding to DNA, proceeding to slide (light disks) and hop (dark
disks), and finally permanently dissociating from DNA. This example diffusion
trajectory has two discernible hops.
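For reference, Eqs. (1) and (2) can be inverted in closed form once $N$ and $\langle t_3\rangle$ are known from simulation. The rearrangement below is only a worked restatement of those two relations, not an additional result:

```latex
\begin{align}
\langle t_1 \rangle &= \frac{t}{N} - \langle t_3 \rangle , \\
D_1 &= \frac{D t - D_3 N \langle t_3 \rangle}{N \langle t_1 \rangle}
     = \frac{D t - D_3 N \langle t_3 \rangle}{t - N \langle t_3 \rangle} .
\end{align}
```

The second form of $D_1$ follows by substituting $N\langle t_1\rangle = t - N\langle t_3\rangle$ from Eq. (1).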

For each hopping simulation, a protein was initially positioned at the
protein-center to DNA-center distance of
$R = r_{\rm DNA} + r_{\rm protein} + \Delta r$, where $r_{\rm DNA} = 1$ nm is
the DNA radius, $r_{\rm GFP-LacI} = 2.68$ nm, and $\Delta r \approx 0.5$ nm is
an estimate of the protein-DNA binding distance (or location of the
interaction potential minimum, beyond which we consider no protein-DNA
interactions) [26, 27]. The protein immediately dissociated from DNA and
underwent 3D diffusion until rebinding to DNA, at which time the position was
recorded, or until the maximum number of steps of the hopping simulation was
reached, in which case the protein was assumed to have permanently dissociated
and its diffusion trajectory was not used in subsequent data analysis. Figure
2 describes the criterion for determining whether a hopping protein collided
with DNA. For every step, the length of the perpendicular drawn from the
center of the DNA to the line connecting the last two protein locations
(dashed arrow) was calculated; if it was less than $R$, association occurred.
The binding position was chosen to be the midpoint between the two protein
locations. We modeled DNA as an infinite, rigid cylinder, assuming 100%
probability for association upon protein-DNA collision; the distance between
the protein binding location and its origin denotes the hopping distance.
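The collision test and single-hop loop described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function names are hypothetical, the DNA axis is taken as the z axis, lengths are in nm, and the perpendicular is clamped to the segment actually travelled (an assumption). The treatment of the very first step is also an assumption, so the sketch is not expected to reproduce the reported hop statistics.

```python
import math
import random


def min_radial_distance(p0, p1):
    """Shortest distance from the DNA axis (taken as the z axis) to the
    straight segment p0 -> p1, measured in the x-y plane."""
    x0, y0 = p0[0], p0[1]
    dx, dy = p1[0] - x0, p1[1] - y0
    seg2 = dx * dx + dy * dy
    if seg2 == 0.0:
        return math.hypot(x0, y0)
    # Foot of the perpendicular from the axis, clamped to the segment
    # (assumption: only the path actually travelled can touch the DNA).
    s = max(0.0, min(1.0, -(x0 * dx + y0 * dy) / seg2))
    return math.hypot(x0 + s * dx, y0 + s * dy)


def simulate_hop(R, delta, max_steps, rng):
    """Return the hopping distance along the DNA (same units as R), or
    None if max_steps is reached (treated as permanent dissociation)."""
    pos = (R, 0.0, 0.0)
    for _ in range(max_steps):
        # Independent Gaussian step in x, y, z with standard deviation delta.
        new = tuple(c + rng.gauss(0.0, delta) for c in pos)
        if min_radial_distance(pos, new) < R:
            # Binding position: midpoint of the last two locations.
            return abs(0.5 * (pos[2] + new[2]))
        pos = new
    return None


rng = random.Random(0)
hops = [simulate_hop(R=4.2, delta=0.0267, max_steps=1000, rng=rng)
        for _ in range(100)]
returned = [h for h in hops if h is not None]
print(f"{len(returned)} of {len(hops)} hops returned within the step budget")
```

The step budget here is far smaller than the one used in the paper so that the sketch runs in about a second.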

FIG. 2. (Color online) Determination of protein-DNA association. The gray
(open) circle marks the effective protein-DNA binding distance. The protein
moves ballistically between consecutive steps.

The simulation parameters were determined as follows. The hopping simulation
step size $\delta$ and step time $\tau$ are the collision distance and time,
respectively [28]. At temperature $T = 294$ K, the instantaneous velocity of a
protein of mass $m$ in solution is the root mean square (rms) velocity
$\langle v^2\rangle^{1/2} = \sqrt{k_B T/m} = \delta/\tau = 6.02$ m/s, where
$k_B$ is the Boltzmann constant and $m = 67.5$ kDa for a GFP-LacI monomer.
Using the Einstein-Stokes relation,
$D_3 = \delta^2/(2\tau) = k_B T/(6\pi\eta r) = 8.03\times10^{7}$ nm$^2$/s for
GFP-LacI, where the viscosity of water is $\eta = 10^{-3}$ N s/m$^2$ and the
protein hydrodynamic radius $r$ is 2.68 nm (assuming a typical protein density
of 1.38 g/cm$^3$), we obtain $\delta = 2D_3/\sqrt{\langle v^2\rangle}$.
Therefore, $\delta = 0.267$ Å and $\tau = 4.46$ ps. Each simulation step in
the x, y, z dimensions was drawn from a Gaussian distribution with a mean of
zero and a standard deviation of $\delta$.
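The step-size arithmetic above can be checked numerically. The sketch below assumes standard SI values for the Boltzmann constant and the atomic mass unit; the function name and dictionary keys are hypothetical. With the rounded inputs quoted in the text it gives values within about one percent of those reported.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
DALTON = 1.66053907e-27   # atomic mass unit, kg


def step_parameters(mass_kda, radius_nm, temp_k=294.0, viscosity=1.0e-3):
    """Collision step size and step time from the rms thermal velocity
    and the Einstein-Stokes diffusion coefficient (SI units internally)."""
    m = mass_kda * 1e3 * DALTON                      # kg
    v_rms = math.sqrt(K_B * temp_k / m)              # m/s
    d3 = K_B * temp_k / (6.0 * math.pi * viscosity * radius_nm * 1e-9)  # m^2/s
    delta = 2.0 * d3 / v_rms                         # m, from D3 = delta^2 / (2 tau)
    tau = delta / v_rms                              # s, from v_rms = delta / tau
    return {
        "v_rms_m_per_s": v_rms,
        "D3_nm2_per_s": d3 * 1e18,
        "delta_angstrom": delta * 1e10,
        "tau_ps": tau * 1e12,
    }


params = step_parameters(mass_kda=67.5, radius_nm=2.68)
for name, value in params.items():
    print(f"{name}: {value:.4g}")
```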

The time limit for simulation of each GFP-LacI hop was $\approx 1$ ms (or
$2.1\times10^{8}$ steps), selected according to the following two estimates.
(1) Since the observed diffusion of proteins on DNA is the combination of
sliding and hopping with diffusion coefficients $D_1$ and $D_3$, respectively,
the maximum total hopping time of a diffusion trajectory cannot exceed
$N t_{3,\max} = Dt/D_3$ when $D_1 \approx 0$. For GFP-LacI,
$\langle D\rangle \approx 2\times10^{4}$ nm$^2$/s [3], which dictates that
$t_{3,\max} \approx 0.25$ ms when $t$ is on the order of 1 s, using the lower
bound for $N$ of one hop per diffusion trajectory. Therefore, a hopping time
limit of $t_{3,\max} \approx 1$ ms for a single hop should be sufficiently
long for all 3D-diffusing proteins to return to DNA. (2) A longer hopping time
limit, such as 10 ms per hop (data not shown), results in additional proteins
returning to DNA with individual hopping distances longer than
$\sqrt{2\langle D\rangle t} = 200$ nm, a distance that is detectable in SM
measurements and is usually used to separate single diffusion trajectories
into segments free of large displacements for accurate $D$ analysis [ana.
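The two estimates behind the hopping time limit are simple enough to verify directly. The sketch below uses the values quoted in the text (in nm, s, and nm$^2$/s); the function names are hypothetical.

```python
import math


def hop_time_bound_ms(D_obs, t_total, D3, n_hops=1):
    """Upper bound on the time of a single hop, in ms, from
    N * t3_max = D * t / D3 with the sliding contribution neglected."""
    return 1e3 * D_obs * t_total / (D3 * n_hops)


def rms_displacement_nm(D_obs, t_total):
    """1D rms displacement sqrt(2 D t), in nm."""
    return math.sqrt(2.0 * D_obs * t_total)


bound_ms = hop_time_bound_ms(D_obs=2e4, t_total=1.0, D3=8.03e7)
rms_nm = rms_displacement_nm(D_obs=2e4, t_total=1.0)
print(f"t3_max bound ~ {bound_ms:.2f} ms, rms displacement ~ {rms_nm:.0f} nm")
```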



III. RESULTS AND DISCUSSION 

For $4\times10^{5}$ GFP-LacI hopping simulations (maximum simulation time
$t_{3,\max} \approx 1$ ms) with $\delta = 0.267$ Å and $R = 4.2$ nm, 99.809%
of the trials resulted in the protein reassociating to DNA, and thus the
probability for a simulated hop to return to DNA is $P = 0.99809$. The hopping
characteristics are shown in Figs. 3A and 3B: the mean hopping distance along
DNA is 3.37 Å (median, 0.41 Å), the mean hopping height (the maximum radial
distance of the protein from DNA) is 4.93 Å (median, 0.45 Å), and the mean
number of steps per hop is $4.97\times10^{4}$ (median, 5), yielding a mean
hopping time $\langle t_3\rangle = 0.22$ µs. The mean number of hops in a
GFP-LacI diffusion trajectory is $N = 526$, obtained by dividing the total
number of simulated hops ($4\times10^{5}$) by the total number of non-returned
hopping events (763); the distribution of the number of hops per diffusion
trajectory is shown in Fig. 4. This set of values has been verified to
converge with those from a larger simulation of $4\times10^{6}$ hops;
specifically, the $N$ values differ by 0.57%. The inset of Fig. 3B shows the
distribution of total hopping displacements in a diffusion trajectory, with
each data point simulated from 526 randomly selected hopping displacements.
The rms total hopping displacement per diffusion trajectory is 127.5 nm
($\sqrt{2D_3 N\langle t_3\rangle}$), and the mean total hopping time is
$N\langle t_3\rangle = 115$ µs. Note that although shorter hopping distances,
such as those less than the base pair length of 0.34 nm, carry no direct
biological significance and do not noticeably disrupt sliding, they are
important for correctly assessing rms total hopping displacement statistics in
a diffusion trajectory.
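The relation between the return probability $P$ and the mean number of hops $N$ can be illustrated with a short sketch. It assumes that successive hops are independent, so that the number of hops per trajectory is geometrically distributed; that assumption and the function names are ours, not the paper's. With the quoted $P = 0.99809$ the estimate lands within about one percent of the reported $N = 526$.

```python
import random


def mean_hops_per_trajectory(p_return):
    """Mean number of hops per trajectory when each hop independently
    returns to DNA with probability p_return."""
    return 1.0 / (1.0 - p_return)


def sample_hops_per_trajectory(p_return, rng):
    """Draw one trajectory: count hops up to and including the first
    hop that fails to return."""
    n = 1
    while rng.random() < p_return:
        n += 1
    return n


P_RETURN = 0.99809                 # simulated return probability quoted in the text
N_est = mean_hops_per_trajectory(P_RETURN)
total_hop_time_us = 526 * 0.22     # N * <t3> with the quoted N and <t3>, in microseconds
print(f"N ~ {N_est:.0f}, N<t3> ~ {total_hop_time_us:.0f} us")
```

Calling `sample_hops_per_trajectory` repeatedly produces a broad, exponential-like spread of hop counts around that mean.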

We can also compute the 'diffusion to capture' probability $P$ for a protein
to return to DNA using a steady-state solution to the diffusion equation,
incorporating a cutoff radial distance $c$ [55]. Proteins released after the
initial step at $b = 4.22$ nm are either adsorbed at the DNA surface
($R = 4.2$ nm) or escape beyond $c = R + \sqrt{4D_3 t_{3,\max}}$. The
probability is time-independent and given by

$$ P = \frac{\log(c/b)}{\log(c/R)} = 0.99896 . \tag{3} $$

Imposing the same cutoff distance $c = 551.2$ nm in subsequent simulations, we
obtained $P = 0.99865$, in near agreement with the analytical value above.
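Equation (3) can be evaluated directly. The sketch below uses the rounded values of $b$, $R$, and $c$ quoted in the text; because $P$ is very sensitive to the rounding of $b$, the result matches the quoted value only to within about $10^{-4}$. The cutoff is computed under the assumption that $t_{3,\max}$ equals the quoted step budget times the step time, which is our reading rather than a statement from the paper.

```python
import math


def capture_probability(b, R, c):
    """Steady-state probability that a particle released at radius b is
    absorbed at the inner cylinder R before reaching the outer cutoff c."""
    return math.log(c / b) / math.log(c / R)


def cutoff_radius(R, D3, t_max):
    """Cutoff radius c = R + sqrt(4 * D3 * t_max), in the units of R."""
    return R + math.sqrt(4.0 * D3 * t_max)


P = capture_probability(b=4.22, R=4.2, c=551.2)
# Assumption: t_max = 2.1e8 steps * 4.46 ps per step (about 0.94 ms).
c_est = cutoff_radius(R=4.2, D3=8.03e7, t_max=2.1e8 * 4.46e-12)
print(f"P ~ {P:.4f}, c ~ {c_est:.0f} nm")
```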

FIG. 3. (Color online) (A) Distributions of hopping distances along DNA for
$\delta = 0.267$ Å and $R = 4.2$ (green, open circles) and 10.2 nm (red dots),
and hopping height for $R = 4.2$ nm (gray line). (B) Distributions of the
number of steps per hop for $R = 4.2$ and 10.2 nm. Inset, distribution of
total hopping displacement per diffusion trajectory and Gaussian fit (solid
line). (C) Number of hops per diffusion trajectory longer than 0.25 Å, and up
to hops longer than 800 nm, for $R = 4.2$ and 10.2 nm. The crosses are
experimental data for EcoRV proteins, where the occurrence rates of hops
longer than 200 nm per diffusion trajectory are 0.06, 0.1, and 0.16 (the 0.15
value was omitted for clarity) [15]. (D) GFP-LacI total diffusion time $t$
distribution (from experimental data in Ref. [3]). The mean of the exponential
fit (solid line) is 10.4 s.

FIG. 4. (Color online) Distribution of the number of hops per diffusion
trajectory. The results of $4\times10^{5}$ individual hopping simulations
constitute a total of 763 protein diffusion trajectories such that 526 hops
occur on average per trajectory.

Having obtained $\langle t_3\rangle$ and $N$ from simulation, we now solve
Eqs. (1) and (2) for $\langle t_1\rangle$ and $D_1$ from the experimentally
measured values of $t$ and $D$. With values of $D$ for GFP-LacI ranging from
$2.3\times10^{2}$ to $1.3\times10^{5}$ nm$^2$/s [3] and $t = 10.4$ s (Fig.
3D), the sliding time is several tens of ms and $D_1$ ranges from $\approx 0$
for slow diffusion to $\approx D$ for fast diffusion. The
$\langle D_1\rangle$ for GFP-LacI is $9.1\times10^{3}$ nm$^2$/s using
$\langle D\rangle$ of $10^{4}$ nm$^2$/s. Since $D_1 > 0$, Eq. (2) sets the
lower bound of $D$ such that it must be greater than
$D_3 N\langle t_3\rangle/t \approx 896$ nm$^2$/s. The rms total sliding
displacement in a diffusion trajectory becomes longer than the rms total
hopping displacement when $D > 2ND_3\langle t_3\rangle/t \approx 1790$
nm$^2$/s.
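Solving Eqs. (1) and (2) for the sliding parameters is a two-line calculation. The sketch below plugs in the values quoted in the text (s, nm$^2$/s); the function names are hypothetical, and small differences from the reported numbers are expected because the inputs are rounded.

```python
def solve_sliding(t, D, N, t3, D3):
    """Invert t = N<t1> + N<t3> and 2Dt = 2 D1 N<t1> + 2 D3 N<t3>
    for the mean sliding time <t1> and sliding coefficient D1."""
    t1 = t / N - t3
    D1 = (D * t - D3 * N * t3) / (N * t1)
    return t1, D1


def min_observable_D(t, N, t3, D3):
    """Lower bound on D implied by D1 > 0 in Eq. (2)."""
    return D3 * N * t3 / t


t1, D1 = solve_sliding(t=10.4, D=1.0e4, N=526, t3=0.22e-6, D3=8.03e7)
D_min = min_observable_D(t=10.4, N=526, t3=0.22e-6, D3=8.03e7)
print(f"<t1> ~ {t1 * 1e3:.1f} ms, D1 ~ {D1:.3g} nm^2/s, D_min ~ {D_min:.0f} nm^2/s")
```

Doubling `D_min` gives the crossover above which the rms sliding displacement exceeds the rms hopping displacement.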

Since our protein-nonspecific-DNA binding distance is an estimate, we have
carried out simulations with $\Delta r$ ranging from 0.5 to 6.5 nm
(corresponding to protein-DNA distances $R$ of 4.2 and 10.2 nm, respectively).
Comparing the $R = 10.2$ nm results to the $R = 4.2$ nm results, the
distributions for hopping distances (Fig. 3A) and hopping times (Fig. 3B) are
similar, although the mean hopping distance reduces to 2.82 Å, the mean number
of steps per hop reduces to $3.23\times10^{4}$, and the mean number of hops
$N$ doubles to 1101. Solving for $\langle t_1\rangle$ and $D_1$ at $R = 10.2$
nm, we found $\langle t_3\rangle = 0.14$ µs, $N\langle t_3\rangle = 154$ µs,
$\langle t_1\rangle = 9.4$ ms (approximately half of the value for $R = 4.2$
nm), and $D_1$ to be similar to the previously calculated value for $R = 4.2$
nm. Given that the sliding and hopping values at $R = 4.2$ and 10.2 nm are
close, our method and results can be safely applied to most DB protein-DNA
binding distances.

To investigate hopping distances within a diffusion trajectory, Fig. 3C shows
the distribution of the number of hops per diffusion trajectory longer than a
finite hopping distance, ranging from 0.25 Å to 800 nm, for $R = 4.2$ and 10.2
nm. For the 4.2 nm results, 3.37 hops in a diffusion trajectory were longer
than 5 nm, and 11% of diffusion trajectories had a hop longer than 200 nm. As
expected, the results for 10.2 nm are approximately twice as large since $N$
is doubled. The crosses represent EcoRV proteins, which have a comparable
hydrodynamic radius of 2.66 nm (see Table I) and were experimentally observed
in different buffers to have hopped longer than 200 nm, with reported
occurrences ranging from 6 to 16% per diffusion trajectory [15]. These
observations are in agreement with our simulation results. Furthermore, for
hops longer than 300 nm and 500 nm, our observations agree with the reported
values in Fig. 4A of Ref. [15].



Other DB proteins may differ from GFP-LacI in their sizes, and thus in
$\delta$ and $R$. Table I lists DB proteins that can hop on DNA (instead of
proteins that slide only [11]) studied using SM fluorescence tracking methods
on elongated DNA. Despite the difference in $R$ of up to 1.26 nm, the $\delta$
values differ by less than 0.07 Å. The effect of the $R$ difference is
considered in Fig. 5A, which shows the number of hops per diffusion trajectory
longer than a finite distance, ranging from 0.1 Å to 800 nm, for
$\delta = 0.267$ Å and $R$ from 4.2 to 10.2 nm. The number of hops per
diffusion trajectory increases moderately with $R$ for all hopping distances,
indicating that our hopping results are applicable to most observed DB
proteins.

The step size $\delta$ in the current approach, based on microscopic Brownian
random walk models, can be made larger or smaller for vastly different
particle sizes. Figure 5B shows distributions of hopping distances for three
$\delta$ values: 0.267, 3.4, and 10 Å (we used $R = 4.2$ nm and
$t_{3,\max} \approx 1$ ms). The distribution curves collapse when protein
hopping distances are larger than $\delta$, indicating that the tail
distribution of protein hopping probability has the same asymptotic form at
long distances, in agreement with the solution to the diffusion equation
[313]. However, the mean hopping distance (Fig. 5B inset; values are 3.37, 36,
and 95 Å), the mean number of hops $N$ in a trajectory (526, 42, and 14), and
$\langle t_3\rangle$ (0.22, 3.1, and 9.2 µs) all depend sensitively on
$\delta$, as short-length-scale motions dominate protein-DNA reassociation
(Fig. 3A). This regime cannot be accessed in the macroscopic theory, i.e., by
solving the diffusion equation directly.

When the protein-nonspecific-DNA association probability $p$ is not 100%,
e.g., due to rotation of the DNA-binding domain during large hops, hopping
statistics and the subsequent sliding statistics will change. For a low
binding probability of $p = 10\%$, although on average ten consecutive hops
would be needed for reassociation, the mean number of association attempts
will still be $N$.



TABLE I. DB protein diffusion properties on elongated DNA.
Protein    $r_{\rm protein}$ (nm)    $\delta$ (Å)    $D$ (nm$^2$/s)



YFP-LacI 
GFP-LacI 
EcoRV, 2 
EcoKVf] 
RNAP 

rnafF 

hOggl 
p53 
UL42 
L7 g P 5, 
T7 gp5, 
C-Ada 



3.13 
2.68 
2.66 



2.36 
2.34 
2.63 
2.86 
3.00 
1.77 



0.284 
0.267 
0.262 



0.247 
0.246 
0.261 
0.272 
0.278 
0.214 



4.6 xlO 4 Pf 
2.3xl0 2 - 1.3xl0 5 [3] 
0.9 -2.5xl0 4 Q2] 
3.1xl0 3 PH 
6. lxlO 3 -4.3x10 s P3] 
1.3x10 s [H], ~10 4 [S] 
5.78 x10 s 
3.01x10 s 



8.0x10° 



1. 86x10" 
4.0x10 s 
1.3xl0 6 



M 
021 



However, the effective mean hopping time $\langle t_3'\rangle$ and the mean
hopping distance are expected to increase, while the effective number of hops
per diffusion trajectory, $N'$, decreases since $t$ is held constant. The
effective total hopping time $N'\langle t_3'\rangle$ and the rms total hopping
distance per diffusion trajectory should therefore remain constant. The
binding probability is thus inversely related to the effective mean sliding
time $\langle t_1'\rangle$ according to Eq. (2), which for $p = 10\%$ results
in a 10-fold increase in $\langle t_1'\rangle$.
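The scaling argument for an association probability $p < 1$ can be made concrete with a small sketch. It assumes that a successful reassociation takes on average $1/p$ elementary hops, so that $N' = pN$ and $\langle t_3'\rangle = \langle t_3\rangle/p$; these relations and the function name are our illustrative reading of the argument, not formulas given in the paper.

```python
def effective_kinetics(p, t, N, t3):
    """Effective hop count, hop time, and sliding time when only a
    fraction p of protein-DNA collisions leads to association."""
    N_eff = p * N                  # fewer completed hops per trajectory
    t3_eff = t3 / p                # each completed hop lasts 1/p times longer
    t1_eff = t / N_eff - t3_eff    # Eq. (1) with t held constant
    return N_eff, t3_eff, t1_eff


t, N, t3 = 10.4, 526, 0.22e-6      # values quoted for GFP-LacI at R = 4.2 nm
t1 = t / N - t3                    # sliding time for p = 100%
N_eff, t3_eff, t1_eff = effective_kinetics(p=0.1, t=t, N=N, t3=t3)
print(f"<t1'>/<t1> = {t1_eff / t1:.1f}, N'<t3'> = {N_eff * t3_eff * 1e6:.0f} us")
```

Under these assumptions the total hopping time is unchanged while the mean sliding time grows by the factor $1/p$.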

When salt concentration varies, $p$ and $R$ will change, as will $D_3$ within
a few angstroms of the DNA surface. However, since $t$ remains
$\approx N'\langle t_1'\rangle$ because
$N'\langle t_3'\rangle \ll N'\langle t_1'\rangle$, the observed changes in $t$
with salt concentration are likely due to changes in the total sliding time
rather than the total hopping time. Consequently, changes in $t$ as a result
of varying salt concentration are not indicative of hopping and should not be
used to determine its presence in diffusion trajectories, in disagreement with
Refs.

pnansniiciT]. 

FIG. 5. (Color online) Distributions of the number of hops per diffusion
trajectory longer than 0.1, 0.34, 1, 5, 10, 20, 50, 100, 200, 300, 500, and
800 nm (top to bottom in A), (A) for $R$ ranging from 4.2 to 10.2 nm (left to
right) and (B) for $R = 4.2$ nm and $\delta = 0.267$ (circles), 3.4 (empty
squares), and 10.2 Å (crosses). Inset, hopping distance distributions for the
three $\delta$ values.

Some studies use flow to elongate DNA and/or investigate hopping properties of
DB proteins [10, 11, 20, 21, 31]. Here we describe the effect of flow on
hopping distances using the maximum reported flow rate in SM studies of 100
µm/s. For our mean hopping time of $\langle t_3\rangle = 0.22$ µs, a typical
dissociated protein is carried by flow a distance of 0.22 Å along DNA; this
distance is negligible compared to its mean hopping distance of 3.37 Å. (The
total displacement of the protein from flow alone within a diffusion
trajectory consisting of 526 hops will be 11.6 nm, which is substantially less
than the total hopping displacement of 127.5 nm observed for GFP-LacI, and
similarly for other proteins, as shown above.) On the other hand, for a
trajectory that includes a 1-µm-long hop, which occurs once every 1000
diffusion trajectories, the hopping time is 6.22 ms and the protein is carried
622 nm along DNA by the flow. This distance would be sufficiently large for
the protein to be considered dissociated.
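The flow estimates above follow from multiplying the flow speed by the time a protein spends off the DNA. In the sketch below, the hop time for a 1 µm hop is estimated from 1D diffusion, $x^2 = 2D_3 t$; that choice is our assumption, made because it closely matches the quoted 6.22 ms, and the function names are hypothetical.

```python
def flow_drift_nm(flow_um_per_s, hop_time_s):
    """Distance (nm) a dissociated protein is carried along DNA by flow."""
    return flow_um_per_s * 1e3 * hop_time_s


def hop_time_for_distance_s(distance_nm, D3):
    """Time (s) for 1D diffusion over distance_nm, from x^2 = 2 * D3 * t."""
    return distance_nm ** 2 / (2.0 * D3)


FLOW = 100.0                                          # um/s, maximum reported flow rate
drift_per_hop = flow_drift_nm(FLOW, 0.22e-6)          # typical hop, nm
drift_per_trajectory = 526 * drift_per_hop            # all hops in one trajectory, nm
t_long = hop_time_for_distance_s(1000.0, D3=8.03e7)   # 1 um hop, s
drift_long = flow_drift_nm(FLOW, t_long)              # nm
print(f"per hop: {drift_per_hop * 10:.2f} A, per trajectory: {drift_per_trajectory:.1f} nm")
print(f"1 um hop: {t_long * 1e3:.2f} ms, drift {drift_long:.0f} nm")
```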

Our results suggest that, for diffusion trajectories without hops longer than
a few hundred nanometers, a protein is unlikely to have been "washed out" by
flow, whereas in trajectories that include such large hops it may be. However,
according to Fig. 3C, such an event occurs in only approximately one percent
of all diffusion trajectories.

Furthermore, sliding kinetics are not expected to be drastically affected by
DNA configuration, since a sliding protein remains in contact with nonspecific
DNA and should not be subject to DNA condensation and coiling either in vivo
or in vitro, in contrast to hopping kinetics. The reported values for $D_1$
and $t$ can therefore be applied under in vivo conditions for better
estimation of target binding rates.



Although we have made several assumptions regarding the nature of 
protein association and modeling DNA, our study sug- 
gests that the observed sliding kinetics is a robust fea- 
ture. Although hopping kinetics will change according 
to in vivo conditions, the lower bound on D for a typical 
DB protein should help future experiments in identifying 
the presence of hopping in protein diffusion trajectories 
with greater certainty. 